This page contains exploratory analysis for the Bongo net data from JSOES. Please note that because the Bongo net tows occur during daylight hours, the tow composition is biased towards taxa/life stages that do not vertically migrate.

Top ten taxa by mean density.
genus_species              common_name                             mean_density_per_m3  sd_density_per_m3
Euphausiidae               Euphausiidae                            117.5                703.9
Cirripedia                 Barnacles                               34.8                 757.1
Chaetognatha               Chaetognatha                            3.5                  16.1
Euphausia pacifica         Euphausia Pacifica                      3.5                  29.4
Mitrocoma cellularia       Mitrocoma Cellularia (Cross Jellyfish)  3.4                  29.9
Limacina                   Limacina                                3.0                  23.3
22 division fish egg       22 Division Fish Egg                    2.9                  8.6
Engraulis mordax           Northern Anchovy                        2.5                  25.1
Calanus marshallae         Calanus Marshallae                      2.4                  11.8
Neotrypaea californiensis  Bay Ghost Shrimp                        2.4                  159.8

Top ten taxa by frequency of occurrence.
genus_species                 common_name                   prop_samples
Euphausia pacifica            Euphausia Pacifica            0.83
Euphausiidae                  Euphausiidae                  0.80
Thysanoessa spinifera         Thysanoessa Spinifera         0.76
Calanus marshallae            Calanus Marshallae            0.70
Cancer oregonensis/productus  Cancer Oregonensis/Productus  0.70
Chaetognatha                  Chaetognatha                  0.70
Themisto pacifica             Themisto Pacifica             0.66
Cirripedia                    Barnacles                     0.64
Limacina                      Limacina                      0.49
Crangonidae                   Crangon                       0.46
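
The two rankings above differ because they summarize different things: mean density favors taxa that are occasionally very abundant, while frequency of occurrence favors taxa that are present in most tows. A minimal Python sketch of both summaries follows. The record layout, the function name `summarize_taxa`, and the choice to count a taxon that is absent from a tow as zero density are assumptions made for illustration, not a description of how the tables on this page were produced.

```python
from collections import defaultdict


def summarize_taxa(records, n_top=10):
    """Rank taxa by mean density and by frequency of occurrence.

    records: list of (sample_id, taxon, density_per_m3) tuples (hypothetical layout).
    Assumption: a taxon not recorded in a sample has zero density there.
    """
    n_samples = len({sample_id for sample_id, _, _ in records})
    totals = defaultdict(float)   # summed density per taxon
    present = defaultdict(set)    # samples in which each taxon was caught
    for sample_id, taxon, density in records:
        totals[taxon] += density
        if density > 0:
            present[taxon].add(sample_id)

    mean_density = {t: totals[t] / n_samples for t in totals}
    prop_samples = {t: len(present[t]) / n_samples for t in totals}
    top_by_density = sorted(mean_density, key=mean_density.get, reverse=True)[:n_top]
    top_by_frequency = sorted(prop_samples, key=prop_samples.get, reverse=True)[:n_top]
    return mean_density, prop_samples, top_by_density, top_by_frequency


# Made-up toy records, not JSOES values.
toy_records = [
    ("tow1", "Taxon A", 10.0),
    ("tow1", "Taxon B", 1.0),
    ("tow2", "Taxon B", 1.0),
    ("tow3", "Taxon B", 1.0),
]
_, _, by_density, by_frequency = summarize_taxa(toy_records)
print(by_density)    # Taxon A ranks first: one large catch dominates the mean
print(by_frequency)  # Taxon B ranks first: it occurs in every tow
```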

To demonstrate potential analyses that could be applied to different taxa, we will focus for now on copepods. In the JSOES Bongo data, Calanus marshallae is the most abundant cold-water copepod and Calanus pacificus is the most abundant warm-water copepod. To show how the abundance of these two key copepod species can be visualized in space and time, we plot their density in the survey catch below.

By taking the mean log density across the survey region, we can also create a crude index of abundance.
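
As a rough illustration of this index, the sketch below averages log density across all tows within each year. The input layout and the function name `abundance_index` are hypothetical, and the `offset` added before taking the log (so that zero catches are defined) is an assumption; the page does not state how zeros were handled.

```python
import math
from collections import defaultdict


def abundance_index(observations, offset=1.0):
    """Mean log density per year.

    observations: list of (year, density_per_m3) pairs, one per tow (hypothetical layout).
    offset: constant added before the log so zero catches are defined (an assumption).
    """
    logs_by_year = defaultdict(list)
    for year, density in observations:
        logs_by_year[year].append(math.log(density + offset))
    # Return years in chronological order so the result can be plotted directly.
    return {year: sum(vals) / len(vals) for year, vals in sorted(logs_by_year.items())}


# Made-up densities for two years.
toy = [(2015, 0.0), (2015, 4.2), (2016, 12.5), (2016, 9.1)]
for year, value in abundance_index(toy).items():
    print(year, round(value, 3))
```

Because every tow carries equal weight, an index of this kind is sensitive to changes in which stations were sampled from year to year, which is one reason to treat it as crude.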


From the maps and the indices of abundance, we see that Calanus marshallae is usually more abundant than Calanus pacificus, except in warm years (e.g., 2015-2017, which aligns with the Blob).

Temporal and Spatial Autocorrelation

Before fitting any spatiotemporal models, we must explore the spatial and temporal autocorrelation in the data.


Temporal structure

We can first inspect the autocorrelation in our Calanus marshallae and Calanus pacificus mean annual time series.

At the coastwide scale, the Calanus marshallae time series shows no temporal autocorrelation, whereas the Calanus pacificus time series shows autocorrelation at a lag of one year.
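
For readers who want to reproduce this kind of check, here is a minimal sketch of the sample autocorrelation function together with the usual approximate 95% white-noise bound of 1.96 divided by the square root of the series length. The function names are hypothetical, and the sketch assumes one value per consecutive year with no gaps; a series with missing years would need to be handled before applying it.

```python
import math


def acf(series, max_lag):
    """Sample autocorrelation at lags 0..max_lag.

    Assumes equally spaced observations (for example one value per year, no gaps)
    and a series that is not constant (otherwise the denominator is zero).
    """
    n = len(series)
    if max_lag >= n:
        raise ValueError("max_lag must be smaller than the series length")
    mean = sum(series) / n
    dev = [x - mean for x in series]
    denom = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t + k] for t in range(n - k)) / denom
            for k in range(max_lag + 1)]


def white_noise_bound(n, z=1.96):
    """Approximate 95% bound for the autocorrelation of an uncorrelated series."""
    return z / math.sqrt(n)


# Made-up annual index values.
index = [2.1, 2.4, 2.2, 1.1, 0.9, 1.0, 2.0, 2.3]
r = acf(index, max_lag=3)
bound = white_noise_bound(len(index))
for lag, value in enumerate(r):
    flag = "outside bound" if lag > 0 and abs(value) > bound else ""
    print(lag, round(value, 2), flag)
```

With short annual series the bound is wide, so only fairly strong autocorrelation will stand out.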

We can also inspect temporal autocorrelation at the scale of individual stations. To reduce the number of individual ACF plots, we instead plot histograms showing the distribution of lag-1 autocorrelation across stations for each of our two focal copepod species, along with time series plots for the stations that had significant autocorrelation at this lag:

There is higher temporal autocorrelation at the scale of individual stations for Calanus pacificus, which aligns with our coastwide results. Given the relatively short generation times of these copepods, we might expect that the temporal autocorrelation of copepod abundance is driven by the persistence of oceanographic conditions, such as the consecutive years of high abundance during the Blob.


Spatial structure

To investigate spatial autocorrelation, we will calculate two metrics: semivariance and Moran’s I. Moran’s I measures the overall clustering of the spatial data, and its significance test tells us whether there is support to reject the null hypothesis of no spatial structure. Semivariance, visualized using a semivariogram, allows us to examine how spatial autocorrelation decays with increasing distance. In a semivariogram, strong spatial autocorrelation appears as semivariance that rises with distance and then reaches a plateau at the distance beyond which samples are no longer spatially correlated, as seen in this image from Wikipedia:

An example Variogram.


To calculate the semivariance, we compute the variance of the difference between values (in our case, the log density) for pairs of samples separated by a given distance. It is given by the following formula:

\[ \gamma(h) = \frac{1}{2} \mathrm{Var}\left( \log(x_i) - \log(x_{i+h}) \right) \]

where \(h\) is the distance between two points, \(x_i\) is the density at one location, and \(x_{i+h}\) is the density at a location a distance \(h\) away, so that the difference is taken between log densities. The factor of \(\frac{1}{2}\) is there because the variance of the difference includes the variability at both points; halving it gives the semivariance.

From the semivariance, we can then construct a semivariogram, which depicts the spatial autocorrelation of samples. A semivariogram takes the semivariance calculated for each pair of points, groups the pairs into distance bins, and shows the mean semivariance in each bin as a function of distance.
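
The binning step can be sketched as follows. This is an illustrative implementation under stated assumptions rather than the code behind the figures on this page: coordinates are taken to be in a projected system so that Euclidean distance is meaningful, the per-pair semivariance is computed as half the squared difference in log density, and the function name, `bin_width`, and `max_dist` are hypothetical.

```python
import math
from itertools import combinations


def empirical_semivariogram(coords, log_density, bin_width, max_dist):
    """Mean per-pair semivariance in equal-width distance bins.

    coords: list of (x, y) positions in projected units such as km (assumption).
    log_density: log density at each position, in the same order as coords.
    Returns (bin_midpoint, mean_semivariance, n_pairs) for every non-empty bin.
    """
    n_bins = math.ceil(max_dist / bin_width)
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for i, j in combinations(range(len(coords)), 2):
        h = math.dist(coords[i], coords[j])
        if h >= max_dist:
            continue  # ignore pairs beyond the largest distance of interest
        b = min(int(h // bin_width), n_bins - 1)
        sums[b] += 0.5 * (log_density[i] - log_density[j]) ** 2
        counts[b] += 1
    return [((b + 0.5) * bin_width, sums[b] / counts[b], counts[b])
            for b in range(n_bins) if counts[b] > 0]


# Made-up stations along one transect, with log density increasing offshore.
stations = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
values = [0.0, 1.0, 2.0]
for midpoint, gamma, n_pairs in empirical_semivariogram(stations, values, 1.0, 3.0):
    print(midpoint, gamma, n_pairs)
```

Reporting the number of pairs per bin is useful because bins with few pairs give noisy estimates.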

Given that we did not see much evidence for temporal autocorrelation, we will examine the evidence for spatial autocorrelation on a year-by-year basis: we construct a semivariogram and calculate Moran’s I separately for each year. To summarize the Moran’s I results, we show the p-value for Moran’s I for each year, with the blue dashed line marking a p-value of 0.05.
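
To make the Moran’s I calculation concrete, the sketch below computes the statistic for one year of data and obtains a p-value by permutation. Several choices here are assumptions for illustration and may differ from what was used for the figures: the inverse-distance weights, the one-sided permutation test for positive autocorrelation, the requirement that no two stations share coordinates, and all function names.

```python
import math
import random


def inverse_distance_weights(coords):
    """Weight matrix with w_ij = 1 / distance and zeros on the diagonal.

    Assumes projected coordinates and no two stations at the same position.
    """
    n = len(coords)
    return [[0.0 if i == j else 1.0 / math.dist(coords[i], coords[j])
             for j in range(n)] for i in range(n)]


def morans_i(values, weights):
    """Moran's I: (n / sum of weights) * weighted cross-products / sum of squares."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    w_sum = sum(weights[i][j] for i, j in pairs)
    cross = sum(weights[i][j] * dev[i] * dev[j] for i, j in pairs)
    return (n / w_sum) * cross / sum(d * d for d in dev)


def permutation_p_value(values, weights, n_perm=999, seed=0):
    """One-sided p-value: how often shuffled values give an I at least as large."""
    rng = random.Random(seed)
    observed = morans_i(values, weights)
    shuffled = list(values)
    n_extreme = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if morans_i(shuffled, weights) >= observed:
            n_extreme += 1
    return observed, (n_extreme + 1) / (n_perm + 1)


# Made-up log densities at six stations along a transect.
coords = [(float(k), 0.0) for k in range(6)]
log_density = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
w = inverse_distance_weights(coords)
observed, p = permutation_p_value(log_density, w)
print(round(observed, 3), p)
```

Running this once per year and collecting the p-values gives the kind of summary described above.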

We will first examine spatial autocorrelation in Calanus marshallae:



We will next examine spatial autocorrelation in Calanus pacificus:

Based on the Moran’s I results, there is evidence for spatial clustering in some years but not in others. However, the semivariograms show no clear slope, which indicates that there is no evidence of spatial autocorrelation at our sampling resolution. The spread of the points also indicates that our data are quite noisy. These exploratory results suggest that copepods are quite patchy, with patches occurring at a scale smaller than our sampling resolution (which is 5 nautical miles along a single transect orthogonal to the coast).